To: Distribution 

From: Robert S. Coren 

Date: 01/22/76 

Subject: Canonicalization of Terminal Input 



In theory, terminal input to Multics is converted by the 
ring-zero typewriter DIM to "canonical form", i.e., the physical 
appearance of a line uniquely defines the form in which it will 
be stored. In addition, well-defined meanings are attached to 
input streams containing erase, kill, and escape characters. 



In actual fact, the current typewriter DIM does not meet the 
goals described in the preceding paragraph. The three basic types 
of canonicalization (column assignment, erase/kill, and escape) 
are each handled more or less correctly, but the current design 
does not lend itself to correct and consistent processing of 
combinations of canonicalization types. The trouble is that the 
three types are handled more or less simultaneously. Thus the 
final input resulting from strings such as "\QZ7", "\Q)6tt7", 
"1Q.QQ"' "*025 M * etc., is not predictable under the current 
implementation. 


A redesigned, more efficient version of tty_read is planned 
for Multics release 4.0; in the course of the new design, 
canonicalization will be cleaned up and made consistent. The 
details of this new design will be discussed in a future MTB; the 
purpose of the present document is to set forth a complete 
description of the rules of canonicalization that the new 
tty_read will implement. It is proposed that the rules described 
here be adopted as a standard for all situations in Multics where 
canonicalization is required. 



The three types of canonicalization named above must be 
performed separately in a defined order, to ensure consistency 
and predictability. In particular, the canonicalization process 
is conceptually divided into the following steps: 

1. If the terminal is in "can" mode, perform 
column-assignment canonicalization on the typed input. 

2. If the terminal is in "erkl" mode, perform erase/kill 
canonicalization on the result of step 1. 

3. If the terminal is in "esc" mode, perform escape 
canonicalization on the result of step 2. 

Of course, the actual implementation does not necessarily have to 
perform the three steps in sequence, provided that the result is 
the same as would have been achieved by doing so. 
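
The ordering of the three steps can be sketched as a small driver. 
This is only an illustration in Python, not the tty_read design: the 
name canonicalize, the STAGE_ORDER tuple, and the idea of passing the 
stages in as a dictionary of functions are all assumptions made for 
the example. 

```python
from typing import Callable, Dict, Iterable

# A stage takes the current form of the line and returns the next form.
Stage = Callable[[str], str]

# Fixed order of steps 1-3: column assignment, erase/kill, escape.
STAGE_ORDER = ("can", "erkl", "esc")


def canonicalize(typed: str, modes: Iterable[str], stages: Dict[str, Stage]) -> str:
    """Apply each enabled stage to the output of the previous one."""
    enabled = set(modes)
    result = typed
    for mode in STAGE_ORDER:
        if mode in enabled:          # a stage runs only if its mode is on
            result = stages[mode](result)
    return result
```

The order in which the modes are listed by the caller has no effect; 
only the fixed stage order matters, which is what makes the result 
predictable. 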

The three types of canonicalization are discussed in more 
detail below. If two or more of the rules listed below are 
applicable to a given input string, they are applied in the order 
in which they are presented here. 



COLUMN ASSIGNMENT 



This phase is concerned with determining which printing 
graphics, if any, appear in each physical column position. This 
is determined according to the following rules. 



Rules for the Interpretation of Input Characters 

1. The leftmost position of the carriage is considered to be 
column 1. 

2. Each printing graphic or space typed increases the column 
position by 1. 

3. Each backspace typed decreases the column position by 1 
unless the column position is 1. 

4. A carriage return sets the column position to 1. 

5. A horizontal tab increases the column position to the 
next tab stop; tab stops are defined to be at columns 11, 
21, 31, etc. 

Multics Project internal working documentation. 

6. A newline, form feed, or vertical tab sets the column 
position to 1 and advances the carriage vertically; thus 
no character typed after such a character can share a 
column position with a character typed before it. 
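
Rules 1-6 amount to a small state machine over the typed characters. 
The following Python sketch shows one way to track the carriage 
position; the function name assign_columns and the (line, column, 
graphic) tuple layout are assumptions made for this illustration. 

```python
def assign_columns(typed: str) -> list:
    """Return (line, column, graphic) for every printing graphic typed."""
    placed = []
    line, col = 0, 1                      # rule 1: leftmost column is 1
    for ch in typed:
        if ch == "\b":
            col = max(1, col - 1)         # rule 3: never move left of column 1
        elif ch == "\r":
            col = 1                       # rule 4
        elif ch == "\t":
            # rule 5: next tab stop among 11, 21, 31, ...
            col = ((col - 1) // 10 + 1) * 10 + 1
        elif ch in "\n\f\v":
            line, col = line + 1, 1       # rule 6: new physical line
        elif ch == " ":
            col += 1                      # rule 2: a space occupies no graphic
        else:
            placed.append((line, col, ch))
            col += 1                      # rule 2
    return placed
```

Two graphics that report the same line and column are overstruck; 
how they are ordered in the stored string is a separate question. 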



Rules for the Formation of the Canonical String 



7. Characters on each line are sorted so that their 
associated column positions are monotone increasing. 

8. No carriage return characters may appear in the canonical 
string. 

9. A horizontal tab is preserved as typed unless a printing 
graphic appears in one of the columns skipped by the tab, 
in which case the tab is replaced by an appropriate 
number of spaces. 

10. Backspaces appear in the canonical string only when two 
or more printing graphics share a column position. 

11. When two or more different printing graphics share a 
column position, the characters are sorted as follows: 
graphic with lowest numeric ASCII code, backspace, 
graphic with next lowest numeric ASCII code, etc. 

12. If the contents of a column position consist of two or 
more instances of the same printing graphic, that column 
is reduced to a single instance of that graphic. 

13. A line-ending character (newline, form feed, or vertical 
tab) immediately fc 
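
A minimal Python sketch of rules 7, 8, and 10-12 for a single line 
follows. It is an illustration under stated assumptions, not the 
planned implementation: the name canonical_line and the (column, 
graphic) input pairs are hypothetical, tab preservation (rule 9) is 
left out so that every blank column becomes a space, and blank 
columns are emitted only up to the last printing graphic. 

```python
def canonical_line(placed: list) -> str:
    """Build the stored form of one line from (column, graphic) pairs."""
    columns = {}
    for col, graphic in placed:
        # rule 12: a set keeps one instance of a repeated graphic
        columns.setdefault(col, set()).add(graphic)

    out = []
    next_col = 1
    for col in sorted(columns):                  # rule 7: increasing columns
        out.append(" " * (col - next_col))       # blank columns as spaces
        # rules 10, 11: overstruck graphics in ascending ASCII order,
        # separated by backspaces; a lone graphic gets no backspace
        out.append("\b".join(sorted(columns[col])))
        next_col = col + 1
    return "".join(out)                          # rule 8: no carriage returns
```

For instance, underscores typed under "Two" after a carriage return 
come out as T, backspace, underscore, then underscore, backspace, w, 
and so on, because the underscore sorts after "T" but before "w" and 
"o" in ASCII. 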



ERASE AND KILL CHARACTERS 



The placement of erase/kill canonicalization after 
column-assignment canonicalization and before escape 
canonicalization is strategic in that it causes erase/kill 
processing to work by column position rather than by character. 
This eliminates ambiguity with respect to erase characters 
combined with escape sequences. (See the examples at the end of 
this document.) 

The rules for erase and kill canonicalization are given 
below. 



14. An erase character alone in a column position results in 
the deletion of itself and of the contents of the 
preceding column position. 

15. An erase character alone in a column position and 
preceded by more than one blank column results in the 
deletion of all immediately preceding blank columns, as 
well as of the erase character. 

16. An erase character sharing a column position with one or 
more printing graphics results in the deletion of the 
contents of that column position. 

17. A kill character results in the deletion of its own 
column position and all column positions to its left, 
unless it shares a column position with an erase 
character, in which case rule 16 applies (the kill 
character is erased). 

18. If the terminal is in "esc" mode, an erase or kill 
character alone in a column immediately preceded by an 
escape character alone in a column is not processed as an 
erase or kill character. 



Note that for rule 18 to apply, the erase or kill character must 
actually have been typed in the column immediately following the 
escape character. The reason for this is that it facilitates the 
erasing of escape sequences, e.g., \UU1####. 
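
Rules 14-18 can be sketched in Python as a single left-to-right pass 
over the column contents. This is one possible reading, not the 
tty_read algorithm. The names erase_kill, columns, and render are 
hypothetical, each column is modelled as a string of the graphics it 
holds (the empty string is a blank column), and # and @ are taken as 
the erase and kill characters with \ as the escape character. 

```python
ERASE, KILL, ESC = "#", "@", "\\"


def erase_kill(cols: list, esc_mode: bool = True) -> list:
    """Apply rules 14-18 to a list of per-column strings."""
    kept = []
    for i, col in enumerate(cols):
        # rule 18 looks at the column as typed, not at what survives
        escaped = esc_mode and i > 0 and cols[i - 1] == ESC
        if ERASE in col and len(col) > 1:
            continue                       # rule 16 (covers rule 17's exception)
        if col in (ERASE, KILL) and escaped:
            kept.append(col)               # rule 18: kept as an ordinary graphic
        elif col == ERASE:
            if kept[-2:] == ["", ""]:
                while kept and kept[-1] == "":
                    kept.pop()             # rule 15: drop the run of blanks
            elif kept:
                kept.pop()                 # rule 14: drop the preceding column
        elif KILL in col:
            kept.clear()                   # rule 17: drop everything to the left
        else:
            kept.append(col)
    return kept


def columns(text: str) -> list:
    """One column per character; a space is a blank column."""
    return ["" if ch == " " else ch for ch in text]


def render(cols: list) -> str:
    """Turn column contents back into a string."""
    return "".join(("\b".join(sorted(c)) if c else " ") for c in cols)
```

With these definitions, render(erase_kill(columns("abz#cde"))) gives 
"abcde", and an escape followed by two erase characters keeps the 
escape, since the first erase is protected by rule 18 and the second 
erases only the first. 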



ESCAPE SEQUENCES 



The processing of escape sequences is performed according to 
the rules given below. 

19. An escape sequence consists of an escape character alone 
in its column position followed by one or more printing 
graphics each of which is alone in its column position. 
An escape sequence is replaced by a single character in 
the canonical string. 

20. An escape sequence consisting of two successive escape 
characters is replaced by an escape character. 

21. An escape sequence consisting of an escape character 
followed by an erase (or kill) character is replaced by 
an erase (or kill) character. 

22. An escape sequence consisting of an escape character 
followed by one, two, or three octal digits is replaced 
by the character whose ASCII value is represented by the 
sequence of octal digits. 

23. An escape character followed by a newline character 
results in the deletion of both characters from the 
canonical string. 

24. Other escape sequences may be defined on a 
per-terminal-type basis, where such a sequence consists 
of an escape character and one character following. 

25. If the character following an escape character does not 
result in an escape sequence as defined by rules 20-24, 
the escape and following characters are stored as they 
appear on the line. 
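
The escape rules can be sketched in Python as follows. This is an 
illustration only: the name escape_canon and the extra mapping used 
for rule 24 are hypothetical, \ # @ are assumed to be the escape, 
erase, and kill characters, and the input is assumed to contain no 
overstruck columns, so every character is alone in its column as 
rule 19 requires. 

```python
from typing import Dict, Optional

ESC, ERASE, KILL = "\\", "#", "@"
OCTAL = "01234567"


def escape_canon(line: str, extra: Optional[Dict[str, str]] = None) -> str:
    """Apply rules 20-25 to a line with no overstruck columns."""
    extra = extra or {}                    # rule 24: per-terminal-type table
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        nxt = line[i + 1] if i + 1 < len(line) else ""
        if ch != ESC or not nxt:
            out.append(ch)                 # ordinary character, or trailing escape
            i += 1
        elif nxt in (ESC, ERASE, KILL):
            out.append(nxt)                # rules 20 and 21
            i += 2
        elif nxt in OCTAL:
            j = i + 1                      # rule 22: take up to three octal digits
            while j < len(line) and j < i + 4 and line[j] in OCTAL:
                j += 1
            out.append(chr(int(line[i + 1:j], 8)))
            i = j
        elif nxt == "\n":
            i += 2                         # rule 23: both characters vanish
        elif nxt in extra:
            out.append(extra[nxt])         # rule 24
            i += 2
        else:
            out.append(ch)                 # rule 25: stored as it appears
            i += 1
    return "".join(out)
```

Because this stage runs after erase/kill processing, an erase 
character that survived under rule 18 arrives here still preceded by 
its escape character, and rule 21 then reduces the pair to a single 
erase character. 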



EXAMPLES 

In the examples below, the following conventions are used: 

<NL>     represents a newline 
<CR>     represents a carriage return 
<BS>     represents a backspace 
<HT>     represents a horizontal tab 
<SP>     represents a space 
<nnn>    represents a character whose ASCII value is 
         nnn (octal) 
\        is the escape character 
#        is the erase character 
@        is the kill character 



The examples in the first group illustrate how various typed 
sequences are canonicalized in terms of column position; these 
are followed by examples of erase, kill, and escape 
canonicalization. In the second group, lines are shown as they 
appear physically, with no consideration given to the precise 
sequence of keystrokes that might have produced them. 



COLUMN CANONICALIZATION EXAMPLES 

Example 1 

Typed: Nothing special about this line.<NL> 

Appearance: Nothing special about this line. 

Result: Nothing special about this line.<NL> 

Example 2 

Typed: Extraneous white s<SP><BS>pace is ignored.<CR><SP><NL> 

Appearance: Extraneous white space is ignored. 

Result: Extraneous white space is ignored.<NL> 

Example 3 

Typed: Two ways (2<BS>_) to overstrike.<CR>___<NL> 

Appearance: Two ways (2) to overstrike. ("Two" and "2" underlined) 

Result: T<BS>__<BS>w_<BS>o ways (2<BS>_) to overstrike.<NL> 

Example 4 

Typed: Tab + backspace is<HT><BS>reduced to spaces.<NL> 

Appearance: Tab + backspace is    reduced to spaces. 

Result: Tab + backspace is<SP><SP><SP><SP>reduced to spaces.<NL> 
ERASE-KILL AND ESCAPE EXAMPLES 

Example 5 

Appearance: abz#cde 
Result: abcde 

Example 6 

Appearance: ab   #cde 
Result: abcde 

Example 7 

Appearance: NotSNever oSn Sunday. 
Result: Never on Sunday. 

Example 8 

Appearance: Nq£#u it's right. 
Result: Nqx it's right. 

Example 9 

Appearance: it's right. 

Result: N^jca it's right. 

(Erase character is overstruck; see Rule 16.) 
Example 10 

Appearance: dcl rrs char (1) static init ( " \ 0 1 Pff 6" ) ; 
Result: dcl rrs char (1) static init ("101 6) " ) ; 

Example 11 

Appearance: \022 
Result: <002>i 

(Overstruck 3 is not part of escape sequence.) 

Example 12 

Appearance: ^112 
Result: ^112 

(Overstruck \ is not an escape character.) 

Example 13 

Appearance: a\##b 
Result: a\b 

(First # is not an erase character by rule 18; second # erases 
itself and preceding # by rule 14.) 

Example 14 (similar to Example 13) 

Appearance: a\@#b 
Result: a\b 

Example 15 

Appearance: aH$b 
Result: b 

(The \ is erased by the overstruck #.) 

Example 16 

Appearance: a\\#b 
Result: a\#b 

(Erase canonicalization does not recognize the # by rule 18; 
escape canonicalization recognizes \\ by rule 20, and attaches no 
special meaning to the #.) 



Example 17 

Appearance: a\\##b 
Result: a\b 

(By rule 18, the first # is not an erase character; by rule 14, 
the second # erases itself and the preceding #; then rule 20 
reduces \\ to \.) 

Example 18 

Appearance: a\\###b 
Result: a\b 

(The first # is not an erase; the next two are, erasing the 
second \ and the first #.) 

Example 19 

Appearance: a\\####b 
Result: ab 

(The first # is not an erase, and must be erased before the two \ 
characters. Examples 16-19 illustrate the difficulty of erasing a 
double \; the clearest method is probably to overstrike (alKHb).) 

Example 20 (on 2741-like terminal) 

Appearance: a¢<#b 
Result: a\b 

(Only the < is erased; ¢ is translated to \.) 



